Why we blog: Encourage re-use of our intellectual assets
One of my pet ideas is to use blogs for personal knowledge management (PKM). One argument is that knowledge workers do not like to submit their work to a centralized system because they lose control and accessibility under the codification approach (Desouza, 2003). My argument is that blogs are inherently personal while still allowing for search, sharing, and re-use (supporting the personalization approach, also from Desouza, 2003).
Another reason that these centralized databases frequently aren’t used is that context is stripped from the artifact, so it becomes a disconnected document (Desouza & Awazu, 2004). The metadata should capture some of this context and there should be a methodology section in every report (although goodness knows few librarians explicitly write out their search strategies on their research reports… more to come on this). Nevertheless, there’s no history there. Besides, all of this front-end work is expensive and discouraging (this is mentioned in discussions of finding vs. refinding and re-finding as use). Blogs tend to provide this context in terms of history, linking, and narrative.
If we want to become an expert or if we have a really good idea that we want to see used (not always the case as mentioned in (Desouza, 2003)), how do we publicize it? If we just enter it into a centralized system, it will get lost. Basically, we have to advocate for the idea. We have to make ourselves available for questions so that we can be known as experts. Also, we have to provide a record of work or a history so that we will be found and trusted. A directory isn’t enough, a resume is better, a knowledge map better still, but wouldn’t a blog be best?
Desouza studies software engineers. It could be that open source software communities like SourceForge capture enough of the process that blogs would be redundant. SourceForge also explicitly enables reuse of code pieces and provides a wiki-like history of the software, documentation, a narrative of the development process, etc. So a community like SourceForge, when seen as a whole, can probably solve a lot of these questions. In fact, it might be better to offer this set of tools, but it’s also a lot more time-consuming on both ends. For the non-programmer knowledge worker, a blog might be the best first step.
Incidentally, Jon Udell is looking for comments on his book idea to explore using professional blogs as combination CV and autobiography. Mine is kind of going that way, with the exception that I do try to keep a lot of the more personal, motivational posts out of the blog. I think these things would be necessary in an autobiography. I’d really like to see everyone who publishes self-archive their work using their blogs as pointers/indexes. Talk about context. That would allow for re-use of our intellectual assets (Davenport, Thomas, & Desouza, 2003).
References
Davenport, T. H., Thomas, R. J., & Desouza, K. C. (2003). Reusing intellectual assets. Industrial Management, 45(3), 12.
Desouza, K. C. (2003). Barriers to effective use of knowledge management systems in software engineering. Communications of the ACM, 46(1), 99-101.
Desouza, K. C., & Awazu, Y. (2004). How to put context in the knowledge base. KM Review, 7(2), 8-9.
Updated for formatting and to add tags
Tags: km, pkm
Notes on Searching the Live Web by Hodder
Mary Hodder, lecture to UC Berkeley's SIMS 141 class, 11/22/2005, available in rm format (accessed 12/27/2005).
Live web - blogs, wikis, etc., subset of web
She lists - blog pulse, sphere, technorati, bloglines, ice rocket, pub sub
Difference between static web and live web searching
- return of results (pagerank/relevance vs. reverse chronological), emphasis on live
- link searching vs. kw searching (not immediately obvious where search terms appear)
- engines find blogs by underlying structure produced by common blogging s/w (therefore not all retrieved are blogs, not all blogs retrieved)
- things drop off the front page ("aged") in liveweb search vs. google, which keeps archive (slower to crawl, relevance most important, deeper search)
Metrics of blog search
- links (technorati (last 6 mo), pubsub (not explicitly reported in search results), bloglines (forever)), different from site to site, confusing
- number of blogs searched ... bloglines gives #articles, others #feeds
- what are you actually searching? (see her venn diagram at 18:24)
- blogs no feed (~15%?)
- blogs w/feeds
- feeds not blogs
- number of RSS subscribers (in bloglines, feed via feedburner) -- using one or the other to look at influence or reach is inadequate... people try to extrapolate from both figures, using knowledge of subject area and how techie people are in that area
- her proposed metrics (see her blog, 22 different metrics, search only across smaller communities, not the whole blogosphere)
- blog to blog links (not blogroll, bcs on the fly)
- post to post links
- blogroll (decision to make a semipermanent part of the template, different relationship)
- comments
- blogserver records
- incoming traffic (people who read from bookmarks, find from web searches)
- re-order search results by "authority" -- number of links received. Sphere will allow by relevance
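A minimal Python sketch of the two orderings discussed above: reverse chronological (the live-web default) versus re-ordering by "authority", taken here simply as the number of links received. The Post record, its field names, and the sample numbers are hypothetical and are not from the lecture.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Post:
    title: str
    published: date
    inbound_links: int  # hypothetical count of links received


def by_recency(posts):
    """Live-web default: newest post first."""
    return sorted(posts, key=lambda p: p.published, reverse=True)


def by_authority(posts):
    """Most inbound links first; the newer post wins a tie."""
    return sorted(posts, key=lambda p: (p.inbound_links, p.published), reverse=True)


# Illustrative data only
posts = [
    Post("old but widely linked", date(2005, 6, 1), 120),
    Post("fresh, no links yet", date(2005, 11, 21), 0),
    Post("recent, some links", date(2005, 11, 1), 15),
]

for p in by_authority(posts):
    print(p.inbound_links, p.title)
```

A raw inbound link count is exactly the kind of single metric the lecture calls insufficient, so a real ranking would presumably combine several of the proposed metrics.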
Splogs
- 13k blogs in an hour
- Google doesn't work as hard as they might to get rid of because of advertising dollars
What's needed
- (stop comparing everything to google and static web search from 1997)
- sophisticated interfaces
- topic browsing
- sophisticated weighting tools (more than just inbound link counts)
- adjustments to static web search to fine tune it
Her project to tag w/identity
In response to questions...
Another way to help liveblog search:
- microformats (technorati approach, rel=)
- structured blogging (pubsub approach)
Problem with co-mingling blog links with static web: her example of looking for bank location -- it's not helpful to find blog posts about the bank.
Everything Bad, Gorman, Levy, Liu: How we read electronic media
The new Current Cites (December 2005) (via e-mail) has a review by Leo Robert Klein of an article on reading in digital libraries (Liu, 2005). This connected to several other things I’m looking at right now.
The essence of the argument is that while the digitization of information has enabled powerful tools like hypertext, it has dramatically altered reading for the worse by fragmenting attention, discouraging deep reading, comprehension and retention.
This is what Gorman was complaining about (Gorman, 2005) (in part).
On the other hand, there is Everything Bad Is Good for You (Johnson, 2005). I only just started reading it so I can’t give a thorough review of his point of view or argument (in fact, there’s nothing in the cover bio about the author’s qualifications to write such a book and this *may* be important). What I’ve read so far says that narrative in new media is much more complex, that greater cognitive dexterity (my words, if they make sense) is required to interact with video games (you trivialize the difference if you just call it fine motor control), and that to judge games/internet/media on book standards is wrong/unfair/inaccurate.
Levy’s article is a bit older, but carefully reviews the history of reading and attention. He states that “in reading, the partiality of attention means both that the document itself is selectively attended to and that to which the document points is also only partially grasped. As a process that is linear in time, it is only capable of fixating on one small fragment at any one instant.” (Levy, 1997, 206). He goes on to talk about a change in design of the New Yorker because readers moved from consuming each issue as a whole, from cover to cover, to only “dipping in.” He talks about disaggregation of books into smaller and smaller chunks in the same way moving images on television have become “sound bites” (see my discussion of disaggregation of journals by APS and IOP).
He argues that “current work in digital library design and development is participating in a general societal trend toward shallower, more fragmented, and less concentrated reading” (Levy, 1997, 202). Also that “while one might argue that hypertext is integrative because it permits information units to be gathered up and linked together, it is exactly the integration of fragments that it encourages. And at the same time, as larger units, such as journal articles and even books, are put into hypertextual form, the creation of links among their parts contributes to their further fragmentation or atomization.” (Levy, 1997, 208)
Liu’s new article is essentially a literature review of studies of reading in digital environments# (2005). People do browse and scan more documents and look for keywords. Also, more emphasis is placed on putting the main content above the fold or in the lead paragraph. This has been true for a long time in newspapers, but it is being done in scientific literature.
These articles all report what the authors believe is happening without clear scientific evidence of why or if it’s good or not. Anyone who took lots of standardized reading comprehension tests was told to read the questions first, then scan the document for answers and then move on – that’s the best way to get the most correct answers on the SAT and other tests. So, we also do this when the time constraints are self- or world-imposed instead of just test-imposed and we continue to do it even after we’ve finished our last standardized test (the GRE, hopefully). As I keep saying, and Levy also says, how we read is really based on context. I think bloggers, like everyone else, sometimes read intensively and sometimes scan for facts. In reality, librarians in public service should be *much* better at scanning for facts than the majority of the population – are we really expected to read that whole article from Physical Review Letters to see if it’s relevant to our customer? Of course not! We scan for keywords many, many times a day.
Final thoughts: I want real scientific work on this area with valid, reproducible results. Librarians are frequent and skilled keyword scanners, but should also be good intensive readers for scholarly, peer-reviewed articles in their own field once the article is identified to be of interest.
#nb: I feel that there are some large problems with the methodology and presentation of results in this paper so will not refer to his results, but use his article as a literature review. 1) he doesn’t provide his survey 2) “sample of convenience” – not clear how participants were selected 3) shows *perceived* answers, not actual (IOW, he doesn’t measure the differential in time spent or any of the other questions, and he doesn’t do a critical incident method – he asks the participants to say what they think has changed over the past ten years… this doesn’t tell you what has changed, rather what they perceive is different – this could be impacted by media, participation in the study…) 4) mixes in his ideas and what he’s read in the literature with results – his survey results do not support his statements. For example, he states reasons the participants are spending more time reading online, but the survey asks no questions on this. He is either stating the obvious (so not necessary), or saying more than the survey shows.
References
Gorman, M. (February 15, 2005). Revenge of the blog people. Library Journal. Retrieved 12/21/2005.
Johnson, S. (2005). Everything bad is good for you: How today's popular culture is actually making us smarter. New York: Riverhead Books.
Levy, D. M. (1997). I read the news today, oh boy: Reading and attention in digital libraries. DL '97: Proceedings of the second ACM international conference on digital libraries, Philadelphia, Pennsylvania, United States, 202-211.
Liu, Z. (2005). Reading behavior in the digital environment: Changes in reading behavior over the past ten years. Journal of Documentation, 61(6), 700-712.
Daser posts, Sunday 12/4/05
Please see original posts on http://asistdaser.tripod.com/daserblog/index.blog?from=20051204
Sunday, 4 December 2005
Feedback, evaluation, wrap-up Mood: chatty Notes by Christina Pikas
Bob Kelly: APS focus groups
ML:
presentations up on the web page
list of participants out to the participants
evaluation of process and structure:
David - usually the best part of a conference is in the hallway, I feel like I've been in the hallway the whole time. This is a really good thing and it made it more interactive and a think tank... good people and the right people for the discussion.
Eating all of our meals together is a good thing. Have roundtable discussion topics at lunch
Good having a single track so we could see everything.
Presentations were easy to follow.
Suggestion -- Charleston model w/action items
Link to presentations. Also fewer speakers with more time to each speaker is better and more useful.
Wasn't well enough advertised.
Liked the tight focus.
Introductions are good.
Get a speaker who's not for open access (ML tried and was turned down)
She was expecting more technical presentations
Where are the blog notes available?
Get questions at meals, then have a program in the evening that addresses those questions.
Different seating arrangements.
Backchannel comms. Better wireless. Better power.
Affordable hotel.
David Stern, Yale: STM Libraries in the Future: Quo Vadis Mood: quizzical Purposeful abandonment
Notes by Christina Pikas
We live in a world of conflicts
we answer to faculty and administration
we love books, but don't buy a lot
mutually exclusive expectations
convenience vs. enhanced navigation
more options creates confusion
customization vs. personalization
administration
economies
industry
standards vs. branding
ease of use across platforms, consistency of icons/metaphors
environmental challenges
federation vs harvesting
package plan vs unbundling items
seamless pre-paid vs. transaction vs. tiered
IR - oa or archival
individual archives or consortial?
competing info resources
google/scholar easy vs. comprehensive
worldcat/sakai
-appropriate tools
-which inspec, holdings for pubmed
Incorporating multi-media
-teaching tools
-large datasets
KM
-personal/lab databases (lab results, local storage of group knowledge, links to published literature)
-data manipulation (not just pdf, repurposing of raw data, permissions)
Facilities
-create quiet rooms
-keyboard noises
-new group study spaces (just higher noise, technology, food/social)
-24x7
Self-archiving
IRs need to deal with unpublished, non-peer reviewed materials as well as peer-reviewed journal articles. (conf proceedings, white papers, technical reports)
searching these distributed archives is far from perfect
hybrid journals aren't well handled by link resolvers
(permissions are handled at the journal level, not at the article level)
strictly preservation archives
-lockss, dmca restrictions on sharing
-unnecessary redundancy/cost
-only saving pdfs (not data, not repurposable)
Portico (quality controls xml downloads from publishers, stores, metasearching, migration possibilities, runs on the JSTOR software)
Tags: daser2005
Updated: 12/5 to add tag and picture
David L Osterbur: Drop Your Tools and Run Faster Mood: caffeinated Notes by Christina Pikas
Weick 1996 Drop Your Tools: An Allegory for Organizational Studies. Administrative Science Quarterly 41(2):
Brown and Marek 2005 applied this in Library Admin and Mgmt 19(2): 68-74
Explanations for failure (to survive for firefighters in forest fire)
-listening (hearing, taking in, understanding that there's a different perspective)
-justification
-trust
-control
-skill at dropping
-admit failure (if you drop your tools you're admitting failure)
-social dynamics (following the crowd)
-consequences (proof that dropping tools will be a benefit)
-identity (how much is your identity tied up with those tools)
-replacement skill
Replacement skills
-bioinformatics support
big open access area, all of the data is available for free online... yet libraries aren't teaching these tools
-cheminformatics
who has control (luddites in control)
-adding value
Bioinformatics support
-we don't have to pay for, it's out there and used extensively
-no good service model for providing that support
-bioinformatics support groups don't have service mindsets -- they're researchers themselves and are interested in research, not in helping
-like driving school vs. building the car
-librarians like to search (librarians have a rich source of things to show users that users don't know about)
-libraries need to regain role
Why don't libraries do it?
- "we have a support group" (help the support group, provide a service they don't offer, enough information for everyone)
-no one trained in it
NLM offers regional introductory course
http://www.ncbi.nlm.nih.gov/Class/MLACourse/index.html
advanced course yearly in Bethesda (5 day 9-5 course)
http://www.ncbi.nlm.nih.gov/Class/NAWBIS/index.html
13% of participants in the advanced course have humanities degrees.
-tricks of the trade... free full text textbooks other fab things that you can show in bioinformatics research
example: article re h5n1 increase virulence in mammals... sequence... blast... OR genbank, click on blink look up protein (never use keyword in genbank), gets to the point where he can show the differences between the 1918 pandemic flu virus and the avian flu currently of concern
can draw both sequences in 3d structure and see the differences (rotate, align, etc)
July in J MLA, article telling you how to do this and who is doing it... then just do it
Cheminformatics
ACS biggest provider (*cough* luddites)
Peter Murray-Rust
mass spec, machine readable vs. picture for pub or human understanding... also only give maxima
grad student takes 100 hours to take information away from what he has to
Chemical Markup Language (CML), XML markup... want to make searches for chemicals google-like... semantic grid for chemistry
Value Added
Notre Dame DSpace implementation services offered
Word about digital archives--
we won't be able to migrate fast enough (can't migrate all the data before it has to be migrated again)
Stuart Scheiber Microtome Publishing... ascii text
Conclusion
stay adaptable
from audience:
don't limit your constituency
Tags: daser2005
Updated: 12/5 to add tag and picture
Michael Leach, Harvard: Whither the collections? Whither the librarian? Mood: caffeinated Notes by Christina Pikas
Series of questions to the audience:
How many here work with S&T collections?
What % of collection development $ are serials?
What about the staff who had been employed to manage print collections?
To publishers, how do you want librarians to work with you?
Jan- free material needs to be cataloged, too. Role for librarians in financial setting - advocate open access to administrators who can pay for publication
Vivian- librarians discuss impact factor to administration, understand how the publishing industry works and work with publishers
Jan - q to librarians:
ARL used to derive status from the size of the budget -- librarians may not want the financial things taken away because it may impact status, is this an impact to you
a: yes... # of volumes (old school), budget collection vs. salary (a ratio, so that lowering the collection budget in order to hire librarians to manage free or less expensive electronic resources would penalize you)
also, moving money from collection development budgets to other budgets to support authors publishing in oa does take power away from the libraries -- we won't be selecting materials and providing access -- there will be universal access and all materials will be selected (maybe instruction)
BK: librarians have a key role in organizing information and new finding tools like folksonomies and other social software... especially with the proliferation of freely available information. Open access is a given. Tools to cross disciplines so physicists can see what biologists are saying... Librarians out of the stacks and into the world.
ML: he still sees a lot of his colleagues tied to their physical collections; but the e resources are so rich that librarians need to be moving toward them.
1) think beyond the traditional collections (even if e)
2) work closely with producers of materials (faculty, post-docs, etc), become a support mechanism for these researchers. Help them submit to journals. Help them submit to archives (oa, ir, etc).
3) teaching and advocating. more than information literacy... scholarly communication at all levels
4) google is a good thing. spend less money on OPACs and spend the money elsewhere
5) libraries as a whole do not put enough money into R&D: user needs analysis, developing user interfaces, marketing... this is left to third parties like vendors. (exception Rochester, has a staff anthropologist?!?) we are too passive, meek
audience:
BC: instruction... historically first course in cheminformatics included using the chemical literature but methods of presentation, ethics, intellectual property, knowledge of the prizes/awards... developing information fluency ...
"pardon me, is my eye hurting the end of your umbrella"
PW: library is not based on the physical space, we need to do value-added service, contextual support could be added to the repository... how about forming the citation for me?
T (from SPIRES): a lot of this is already going on, libraries have a tradition, being physically co-located with the library as a programmer is very important
P L-S: our library has already lost complete control of the budget. they are already there because that's all they do, they don't manage collections
Tags: daser2005
Updated: 12/5 to add tag and picture
Tim Hays, NIH: NIH Public Access Policy Mood: caffeinated Notes by Christina Pikas
Announced in Feb, implemented in May. In PubMed Central
Goals
- archive of NIH research
- advance science
- access to the public
Driving Forces included Congress, new IT, increasing public use of the internet
56% of internet users bring documents with them when they visit the doctor's office (?!?)
Internal drivers: need archive to study the outcomes of funding efforts, make information available that they paid for on the public's behalf
The policy:
-0-12 month embargo
-peer-reviewed, original research publications, supported in whole or in part with direct costs from NIH. Not book chapters, editorials, reviews, or conference proceedings.
-currently funded (or if accepted for publication after May..)
-does not affect copyright
-authors are encouraged to add a line to their publication agreement that says that they will submit to NIH
-should not affect peer review
-should not affect scientific publishing (1% journals in pubmed have more than 50% of their articles funded by NIH, 10% of the articles in pubmed were funded by NIH)
-has had some positive effect with journals now having a self-archiving policy
(audience comment that Nature is backsliding from 0 month self-archiving to 6 month self-archiving)
Timeline
Final policy 2/3/2005
System released 5/05
New website 10/05
by Feb/06 hope to have a batch upload function (they're working with Elsevier, Wiley, Nature)
Participation
About 2% to 3% (not including the 8% from PubMedCentral journals that are in there by default) as SH said this 10% mirrors the participation world-wide in self-archiving with no policy (or TLC from a librarian :) )
Delay/embargo
- about 60% immediate pub, 20% after 9-12 months
- Have removed 40 articles because of too early publication (from about 10 journals)
Issues with researchers who want/need to get published, but worry that the copyright negotiations may delay or prevent paper acceptance, plus figuring out funder policy, their institution policy...
NIH is doing outreach
Public Policy Working Group (11/15/2005)
Limited survey of 19 health sciences libraries... 87% of faculty were aware of the policy, 4% had submitted
Largest factors
-time
-priority
-confusion over copyright and version
Q from audience:
-make mandatory (this is being considered, but somewhat difficult because part of the regulatory process, also Dr. Zerhouni saw this as a way of changing the landscape -- did not want to cause bad will with groups with whom NIH cooperates)
-group doesn't include open access publishers
Working with publishers
-3rd party submissions
-Elsevier, Nature, Wiley submit directly (they control version, embargo)
-software tool for offline verification of grant numbers
-will post the publisher version over the author version, place links to the publisher site, correct author errors, place links to article correction notifications on the publisher's web site
-will have xml and pdfs of all documents
Questions to the working group
1) should participation be mandatory - 12 out of 14 yes (two who said no, Elsevier (no, really?) and FASEB)
2) what should be the embargo?
A variety, many said 6 months
3) what is the best version
publisher version, but not clear on whether xml or pdf
Next steps
-continue outreach
-evaluate
-batch uploading
-report to congress
From the audience:
sh: "flawed policy that missed historic opportunity... but can be improved" flaws: voluntary, let the word embargo be said, demand central deposit
should have done: request that the deposit be made either in own IR or in PMC, then PMC can harvest automatically from IRs. make it into an instant deposit upon acceptance. build into metadata that shows up immediately - with e-mail to author to request e-print. NIH should also offer to pay reasonable publishing costs in open access
From Jan V: springer is not on committee, but should be mandatory, no embargo, both xml and pdf, explicitly say on the NIH page that it is OK for open access publishing to be covered by grants (Wellcome trust does this).
From Brad: mandatory is the important thing
From Mary: mandatory needs to happen that will really be transformative
From Paige: impact to publishers... what would it be if all of the large science funders in government (DOD, NASA, ... ) did this
From Peiling: theoretically mandatory is necessary, but getting the regulations in place is a long term thing... what if all new grants from this point forward have that requirement?
Vivian: meeting of publishers hosted by Blackwell where everyone got up and said how horrible this is... unnecessary duplication of what highwire pubs etc are doing...
Is there an analysis that shows if people can get to things that they would not have been able to get to otherwise... IOW, does it really make things available that aren't elsewhere available?
Answer: not enough content, plan to evaluate....
Tags: daser2005
Updated: 12/5 to add tag and picture
Stevan Harnad, Southampton, U Q at Montreal: (no title) Mood: on fire Notes by Christina Pikas
Data and slides available online and may be reused. There were slight technical difficulties.
Why do researchers publish?
Not for money but to communicate results
Open access is:
free, immediate, permanent, and full text online access
primarily peer reviewed journal articles, theses
Why?
Lawrence 2001, more citations to online articles than offline articles in the same venue (not open access effect)
To what extent were Lawrence's results only a CS effect? They compared OA vs. Non-OA in Astro
(he states that there are basically 12 astro journals, all astronomers are at institutions that cover these 12 at minimum, so to them there is open access) and other physics, sociology and biology.
http://opcit.eprints.org/oacitation-biblio.html
Lots of fast moving graphs here, based on a robot that gathered 1.4 million self-archived articles across about 10 fields (not phys, but does include some social sciences). Grouped articles by # of citations (in bins), then graphed for each bin, #articles for each pub year. Did the same for non-open access articles. Then took the ratio of one to the other, found (I believe and I think he'll correct me) that in general, open access articles are more highly cited across disciplines than non-open access articles.
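The binning-and-ratio procedure as I understood it can be sketched in a few lines of Python. This is only an illustration of the steps described in these notes, not the speaker's actual analysis: the bin edges, function names, and sample data are all hypothetical, and the sketch divides raw counts because the notes do not record whether the two sets were normalised for size.

```python
from collections import Counter


def citation_bin(n):
    """Assign a citation count to a bin (bin edges are made up for illustration)."""
    if n == 0:
        return "0"
    if n <= 3:
        return "1-3"
    if n <= 9:
        return "4-9"
    return "10+"


def counts(articles):
    """Count articles per (citation bin, publication year).

    Each article is a (year, citation_count) pair.
    """
    return Counter((citation_bin(c), year) for year, c in articles)


def oa_ratio(oa_articles, non_oa_articles):
    """Ratio of OA to non-OA article counts for each (bin, year) present in both sets."""
    oa, non = counts(oa_articles), counts(non_oa_articles)
    return {key: oa[key] / non[key] for key in oa if non[key] > 0}


# Illustrative data only
oa = [(2001, 12), (2001, 5), (2001, 0)]
non_oa = [(2001, 11), (2001, 0), (2001, 0)]
ratios = oa_ratio(oa, non_oa)
```

A ratio above 1 in the high-citation bins and below 1 in the zero-citation bin would be the pattern consistent with open access articles being more highly cited.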
Dollar value of citation (a la Diamond 1986 and adjusted to current year money), $85/citation... The UK is losing 300k potential citations and 1.5B GBP based on the above calculations.
Research assessment, research funding, citation impact
All of the factors that are used to evaluate researchers and research groups.... correlate highly with citation counts. Citation counts would better highlight a really good article in a lesser journal.
Changing Citation Behavior
Peak of the curve is moving earlier and earlier. Citations may occur within 3 weeks of self-archiving (!) These charts come from citebase. Self-archiving has sped up citation behavior, immediacy, and the movement of physics.
Open access - how?
Archives without an institutional self-archiving policy remain nearly empty.
What prevents us from open access in the form of self-archiving is keystrokes, not copyright
Awareness of author compliance (study?)
81% would willingly comply with self-archiving policy. 5 archives that have mandates are some of the largest (so this works) ... examples CERN, Southampton. University of Tasmania vs. Queensland productivity (?) +archives +librarian assistance +mandate ... (See upcoming D-lib article by Arthur Sale?)
388 institutional archives worldwide (they've found), vast majority are empty. In Germany every institution has an IR, but no policy (and sometimes no tender loving care)
Audience questions
q: Have the successful archives in the above mentioned Australian universities experienced citation effects?
a: early yet, but some
q: Infoglut or version control-
Will there be problems with too many versions... or access to the correct version
a: no, researchers just want the materials. researchers know what they're doing, what's a post-print, what's a pre-print, and what's good literature -- this is not done by librarians but is part of being a professional researcher
Robot -- didn't specify that articles were true OA, just that they were available online fulltext for free at the time of the crawl. Future work may try to address this
Q: about the wording of self-archiving... SH says that it's a supplement to subscription access.
Q: citation life cycle -- doesn't that bias this because articles might be self-archived only after they have proven to be good articles (I may have gotten this wrong sorry D.S.)
SH: they are studying latency, life cycle, immediacy
Tags: daser2005
Updated: 12/5 to add tag and picture, correctly spell the speaker's first name.
Posted by asistdaser at 9:56 AM EST
Daser posts, Saturday 12/3/05
Please see original version on http://asistdaser.tripod.com/daserblog/index.blog?from=20051203
Saturday, 3 December 2005
Peiling Wang, UTK: Research-related Use of Internet-enabled Information Resources Mood: on fire Notes by Christina Pikas
Preliminary study, but she believes that it will scale up. Standard deviation is very small.
Purpose
- identify interdisciplinary differences in the use of internet-enabled information resources for research (not just technology rich or poor, rather, the type and nature of the research)
- identify factors affecting use or nonuse of these resources
- influence design
Research Questions
- which internet information technologies are used in research
- how are these technologies used in information seeking (model of 6 types of information seeking (Ellis?))
- how important are each
Research Design
- indepth f2f interviews
- semi-structured questions (her guide is available; ask her)
(for how does each tech type support each of the 6 of Ellis' types, plus one more type: organizing)
What percentage of your needs are met by electronic resources?
Chaining - forward or backward citation searching
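For anyone unfamiliar with chaining, here is a small Python sketch of the two directions, assuming a hypothetical dictionary that maps each paper to the papers it cites. The paper labels and function names are illustrative only and have nothing to do with the study's instruments.

```python
# Hypothetical citation data: paper -> list of papers it cites
cites = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": [],
    "D": ["A"],
}


def backward_chain(paper, cites):
    """Backward chaining: follow the paper's own reference list."""
    return sorted(cites.get(paper, []))


def forward_chain(paper, cites):
    """Forward chaining: find the papers that cite this one."""
    return sorted(p for p, refs in cites.items() if paper in refs)


print("A cites:", backward_chain("A", cites))
print("A is cited by:", forward_chain("A", cites))
```

Backward chaining only needs the document in hand, while forward chaining needs an index across the whole collection, which is why it tends to depend on a citation database.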
Participants
Productive and active researchers (faculty and doctoral students):
Computer Science
Engineering
Information Science
Journalism
Humanities/Social Sciences (not yet complete)
In progress - 42 interviews right now
Preliminary results
- average 5-7
(sorry for the poor table)
importance | cs | eng | is | JEM
1 | web | db | db | web
2 | email | web | web | opac
3 | e-j | ftp | e-j | database
4 | dlib | opac | opac | email
5 | opac | email | email | e-j
What % e-resources? Eng highest, CS next, InfoSci next....
2 outliers
1-CS prof, 100% electronic
2-Journalism prof, 98% print
Factors affecting use:
- nature and type of research
- availability of digital archives (humanities, historians)
- accessibility of digital archives
- awareness of the resources
- usability of the internet technology
- perception of source quality and reliability
- individual preferences & constraints
strategies
do not save (search again)
do not delete (periodically discard all)
create folders and subfolders
save multiple copies on multiple machines
keep a print copy of the digital documents
work group maintained collection
Implications
- information seeking in the digital age is easier for some but harder for others
- user tools for diverse users
- revamp the metaphor of folders
- provide easy access to digital objects at an atomic level (disaggregation)
daser2005 Updated: 12/5 to add tag and picture
Marie Martens, BioMed Central: Open Access, Moving into the Mainstream Mood: a-ok Notes by Christina Pikas
Subject areas embracing open access
- bioinformatics
- cancer
- arthritis
- public health
- infectious diseases
senior authors believe article downloads more credible than citations (?)
(independent study by CIBER: http://www.ucl.ak.uk/ciber/ciber_2005_survey_final.pdf)
"All truth passes through three stages.
First, it is ridiculed.
Second, it is violently opposed.
Third, it is accepted as being self-evident"
Arthur Schopenhauer
daser2005 Updated: 12/5 to add tag and picture, correct glaring spelling problem
Karla Hahn, Association of Research Libraries: Institutional Repositories, Emerging Frontiers for Policy Making Mood: not sure Notes by Christina Pikas
She's citing Wikipedia :)
Diffusion of Innovations.
Pattern at which people adopt successful innovations .. Everett Rogers.
We're down at the beginning. Westrienen and Lynch D-Lib June 2005 (limitations on data), table, number of IRs per country, number of docs per IR. In September D-Lib, article by Lynch and Lippincott on US IRs.
(see:
Academic Institutional Repositories: Deployment Status in 13 Nations as of Mid 2005
Gerard van Westrienen, SURF Foundation; and Clifford A. Lynch, Coalition for Networked Information
doi:10.1045/september2005-westrienen
Institutional Repository Deployment in the United States as of Early 2005
Clifford A. Lynch and Joan K. Lippincott, Coalition for Networked Information
doi:10.1045/september2005-lynch)
Other work by Foster and Gibbons, Jan 2005 D-Lib
Three main barriers from Foster and Gibbons articles:
- our language, jargon... users don't know IR, metadata, etc
- time ... to find out about IR, understand why and how to use it...
- copyright
A la Clifford Lynch, IRs are sets of services, not software
"Never forget posterity when devising a policy. Never think of posterity when making a speech." Robert Menzies, former Prime Minister of Australia
Policies
Copyright
- authors do not understand their rights, options
- publishers encourage authors to regard as pro forma that they transfer all rights to the publisher
- practices are not consistent among authors, publishers
Peer review
- chicken - egg, get content to look at quality, look at quality to get content
- this is more than just being peacocks, it's their bread and butter, life and death of their careers
New models for scientific works
- MIT CogNet
- Real Climate
- Columbia Earthscape
Digital data
- more on long-lived data (mentioned at ASIS&T, read document here)
- data management plans
Commercialization and content control
- previously, limiting access to make money
- we are not home free
Investment - who pays?
Threats
- underinvestment (investment in science scholarly communication systems has not kept pace with funding in science... can't keep cancelling journals to build repositories, that is not sustainable)
- copyright over-management, under-management
- commercialization
Opportunities
- good that we've jumped on this in new and potentially risky roles, and taken this as a job for librarians
Comment from the audience
- tension also exists between roles as editors, authors, researchers (within the same person)
daser2005 Updated: 12/5 to add tag and picture
Leslie Johnston, UVa: Repository Development at the University of Virginia Library Mood: sharp Notes by Christina Pikas
She is discussing a curated digital library. It's been around since 2003.
Fedora: Flexible Extensible Digital Object Repository Architecture
Not an out-of-the-box repository, it's the underlying toolkit that is a Digital Asset Management architecture (Mellon funded, UVa and Cornell, for the software development, but not for their implementation)
Assumptions
- part of a global network of repositories
- all media types
- searching and browsing equally important
- curated
- primary users UVa community, they do have restricted content
- they'd like to have all digital collections in this repository
Process
Phase I, 2003, prototype
- electronic texts from the library's special collections
- art architecture ...
- got a lot of feedback (130 comments), they categorized, ranked, prioritized them
- number one comment: have more stuff
Phase II, Fall 2004 (final for Fall 2006)
See her article in D-lib for information on testing.
What did it take?
Standards
- ad hoc group documented production standards for media files
- metadata steering group documented local encoding practice, minimum standards, mapped various standards to the local standard
- community digitization standards
Content production
- subject librarians select, with technical assessment (ease of production, need for metadata enrichment, time constraints such as instructional deadlines, funding)
- centralized digital library production service (w/7.5 FTE plus student "scanning monkeys")
- new software tools and scripts
Development
- working groups for functional requirements
- functional requirements and analysis of media files and metadata to document content models (classes of objects and behaviors and mechanisms)
- processes for ingest
- interface
- search
Technology
Developers
- they had no budget
- they borrowed people from other parts of the library
Library Content
- huge queue of stuff to be done
- science stuff (herbarium images, glass astro slides from the parallax project)
Faculty Content
- born digital
- digital humanities projects
Support- librarians, programmers
D. Stern - seems intimidating, but it only took 3 programmers 2 weeks to be able to add info
Marie Martens, BioMed Central: Open Repository Mood: spacey (spacey as in dspace :) )
Notes by Christina Pikas
They are basically a hosting facility for Dspace for institutions that can't host in-house (openrepository.com).
Services
- complete set up within 3 months
- technical support
- hosting
In-house solution has a lot of hidden costs. They have predictable costs -- set-up and maintenance.
The hierarchy is communities and collections. She then went through what it looks like to upload papers and showed an example implementation at UConn.
According to Stevan Harnad, the trouble isn't setting up the software, it's getting the content. (he talked about http://www.eprints.org/, his product, which also offers hosting)
Question from Vivian Siegel - would I have to submit three times - to the journal, to the NIH repository, to my institution? -- no, they have automatic feeds.
Mary Steiner, Penn Library: The Toddler Years for ScholarlyCommons@Penn Mood: chillin' I've been forgetting to sign my posts so I will do it at the start and re-edit the others.
Notes from Christina Pikas
Why get into this business anyway?
(partial list, changes over discipline and over time)
- access to information, highlight student work, stable archives
- specific desire of a campus unit (at Penn, Eng.)
- campus climate (centralized, funding structure, top down vs. grassroots, relationship between IT/library/archives)
- improve visibility/stature of academic research and scholarly activity
Scope and Nature
- short term, where to start, pilot, seeding the repository
- long term, across campus, cooperation
- pilot: the best work from the School of Engineering & Applied Science, recent work, with the endorsement of the leadership
Implementation
- oversight (IT, cataloger, administration, license/copyright person from serials)
- they chose the PQ product to get a turn-key solution with less burden on library and university IT
- f/t only
- they took on the burden of copyright compliance (to encourage submissions without reading through and complying with extensive publisher-specific policies)
(according to Stevan Harnad, you can just put everything up, and make f/t only available to the institution where required, then provide author e-mail and suggest that outsiders send e-mail asking for e-print, then author can send an e-print from the repository)
Operations -- getting content
- submit via e-mail
- harvesting faculty web pages
- alerts on relevant databases, e-mail authors
Going public
- timed the launch (not over break)
- marketed via demos, write-ups, mailings, links, share statistics
- registered with search engines
Assessment
daser2005 Updated: 12/5 to add tag and picture
Jeff Riedel, ProQuest: Alternative Models of Scholarly Communication Mood: caffeinated Notes by Christina Pikas
Back after the break. Change in order, we're now on to sessions again...
Digital Commons is their product... current implementations, UConn, TexasTech....
Other products: eprints at SOTON and Dspace. About 500 institutions now have institutional repositories. See http://www.oaister.org for a sample.
Right now
- majority of objects still text-based
- some disciplines are more likely than others
- discovery paths (75% general web search engines -- going right to the paper, 7% front door, 19% referral/e-mail/direct access). Google access starts at about 90% and then drops to about 40% years later when the IR is established. OAI has very few referrals -- they've done a good job marketing to producers but not to users.
More on marketing OAI
- more content
- better tools can be built on oai, but haven't yet
- needs to be a part of federated search implementations
Challenges and answers
- content recruitment
- no pain for the researchers (yet!), they have to know that it will benefit them first
- regular e-mail reports (your paper has been downloaded x times)
- branded personal researcher pages (contribute to their egosystem)
- citation harvesting
Their product includes a journal publisher module with peer review management, etc.
- see Boston College, Studies in Christian-Jewish Relations (open access)
The overlap of IR and OA
- Both eliminate costs associated with accessing scholarly info (really move the cost)
- Possible because of the internet
Bottom line-
This will not help the stranglehold publishers have on institutions (yep!) -- institutions still pay, maybe not the library.
"There's money in the system. You can move it around, but it can't disappear without a quality loss"
daser2005 Updated: 12/5 to add tag and picture and to sign the top
Vivian Siegel: Building an Open Access World Notes by Christina Pikas
What it's like to launch an open access journal (she was at PLOS).
PLOS uses Berlin definition of Open Access -- not only to read, but to reuse content.
Why open access?
- matches the needs of the researchers as readers and authors
- matches the goals of the funders of the research
- best meets the publishing mandate to widely and rapidly disseminate information
Starting vs. Transitioning to OA
developing reputation vs. established reputation
building submissions, readership, usage (same as with any new journal) vs. established
no legacy data concerns vs. legacy data
can set expectations vs. legacy economics
Additional challenges at PLOS
Building an organization
Possible because of...
- philanthropy
- credibility within scientific community
- support from scientific and library communities
To build submissions...
- Kept an updated list of authors who had papers accepted
- Impact factor!
How do you fund...
- philanthropy
- author charges
- memberships
- advertising
- commercial reprints
- a la BMJ - value-added material subscribers only, research open access
- print supports online
Costs to publishing
- copy editing
- figure manipulation
- professional editors
- front section
- fee waivers
How do you build an open access world?
- have open access options for traditional journals (like the PNAS model)
- front section subscription only (BMJ)
- put open access in researchers' evaluation model
- put aside the money in the funding so that it can't be spent elsewhere.
- reduce the costs of publication (print on demand, more control and responsibility to authors for copy editing and figure manipulation, better open source publication management software)
daser2005 Updated: 12/5 to add tag and picture, signature
Bob Kelly, APS: Expanding Readership in Theoretical and Experimental Physics Mood: bright Notes by Christina Pikas
Anton Chekhov, "there is no national science just as there is no national multiplication table, what is national is no longer science..."
History:
1994 APS workshop: publishing preprints on the web is not pre-publishing for submission purposes
Dropped page charges (for all but PRL)
1995 NRL first institutional repository (TORPEDO)
1998 Physical Review Special Topics Accelerators and Beams -- did not charge, libraries reluctant to catalog since they weren't paying for it.
They (like IOP) are moving away from journals towards articles and collections and integrated collections. (note: this has a lot of implications in a lot of spheres, I believe we discussed this at a PAM session at SLA WRT IOP)
He compared institutional subscriptions (now) with the way it was in the late '90s: individual subs, group/department subs, and library subs. Individual subscriptions have doubled in cost, but in general institutions are paying less because access is available across the institution with the online and with the single print copy in the library. hm.
New Services
- ENTS: essential non-text stuff (ex. holograms)
- RSS
- wikis
- folksonomies
- blogs
- full text xml, single column pdf
Copyright statement - has been evolving, but according to Stevan Harnad (audience) has been a step ahead and was the first green publisher (meaning allows self-archiving, etc)
They're looking at non-traditional revenue sources such as archives back to 1893.
daser2005 Updated: 12/5 to add tag and picture, signature
Steve Moss, IOP: Open Access Perspective from an STM publisher Mood: bright Notes by Christina Pikas
He introduced IOP.
IOP's Open Access Initiatives
1) This month's papers (current 30 days' worth of papers available at no charge on the web page)
2) IOP Select (editorial boards select the best and highest impact papers and make them available at no cost)
3) New Titles (are freely available for up to 2 years)
4) Developing Countries
5) Open Access Journals (New Journal of Physics, Conference Series, Journal of High Energy Physics)
These efforts
- have not increased cancellations
- downloads and submissions are up
- upgrades and subscriptions are up
- has helped impact factors (10 titles more than 20%)
- 18% of the downloads are free, not to subscribers
Expect to break even for a year in 2007 (started publishing in 1998) ... sufficient impact, viable only in long term, requires deep pockets to get there.
See presentation online to see extracts from recent studies on why authors choose to publish where they do. For one study, see Mary Waltham's site. For another see APLSP.
daser2005 Updated: 12/5 to add tag and picture, signature
Introductions... Mood: caffeinated Notes by Christina Pikas
We're now doing introductions. There are people here from IOP, APS, AIP (so physics is well represented), Springer, Biomed Central, and other academic, governmental, and corporate organizations.
High energy physics, medicine, astronomy, chemistry, and engineering environments are all represented.
daser2005 Updated: 12/5 to add tag and picture, signature